Understanding Catastrophic Overfitting in Single-step Adversarial Training


Abstract

Although fast adversarial training has demonstrated both robustness and efficiency, the problem of "catastrophic overfitting" has been observed. This is a phenomenon in which, during single-step adversarial training, the robust accuracy against projected gradient descent (PGD) suddenly decreases to 0% after a few epochs, whereas the robust accuracy against the fast gradient sign method (FGSM) increases to 100%. In this paper, we demonstrate that catastrophic overfitting is very closely related to the characteristic of single-step adversarial training: it uses only adversarial examples with the maximum perturbation, rather than all examples in the adversarial direction, which leads to decision boundary distortion and a highly curved loss surface. Based on this observation, we propose a simple method that not only prevents catastrophic overfitting, but also overrides the belief that it is difficult to prevent multi-step adversarial attacks with single-step adversarial training.
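The distinction the abstract draws can be made concrete with a small sketch (not the paper's code): single-step FGSM always jumps to the boundary of the epsilon-ball, whereas multi-step PGD takes small projected steps and can also reach interior points. The logistic-regression setup and all values (`w`, `b`, `eps`, `alpha`) below are illustrative assumptions.

```python
import numpy as np

def loss(w, b, x, y):
    """Binary cross-entropy of a logistic model on one example."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad_x(w, b, x, y):
    """Gradient of the loss with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return (p - y) * w

def fgsm(w, b, x, y, eps):
    # Single step of size eps: lands exactly on a corner of the L-inf ball.
    return x + eps * np.sign(grad_x(w, b, x, y))

def pgd(w, b, x, y, eps, alpha=0.02, steps=10):
    # Many small signed steps, each projected back into the L-inf ball,
    # so points in the interior of the ball can also be reached.
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_x(w, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

w, b = np.array([1.0, -2.0]), 0.0
x, y, eps = np.array([0.5, 0.5]), 1.0, 0.1
x_fgsm = fgsm(w, b, x, y, eps)
x_pgd = pgd(w, b, x, y, eps)
```

Note that FGSM perturbs every coordinate by exactly ±eps; this is the "only examples with maximum perturbation" property the abstract identifies as the cause of catastrophic overfitting, while PGD's projected iterates are not confined to the ball's corners.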


Similar Articles

Lessons in Neural Network Training: Overfitting

For many reasons, neural networks have become very popular AI machine learning models. Two of the most important aspects of machine learning models are how well the model generalizes to unseen data, and how well the model scales with problem complexity. Using a controlled task with known optimal training error, we investigate the convergence of the backpropagation (BP) algorithm. We find that t...


Understanding Regularization by Virtual Adversarial Training, Ladder Networks and Others

We study a regularization framework where we feed an original clean data point and a nearby point through a mapping, which is then penalized by the Euclidean distance between the corresponding outputs. The nearby point may be chosen randomly or adversarially. A more general form of this framework has been presented in (Bachman et al., 2014). We relate this framework to many existing regularizat...
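The penalty this teaser describes can be sketched in a few lines, under assumed names: a clean point `x` and a nearby point `x_near` are fed through a mapping `f`, and the squared Euclidean distance between the outputs is the regularizer. Here the nearby point is chosen randomly; an adversarial choice would instead perturb `x` in the direction that maximizes the penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))        # toy weight matrix for the mapping

def f(x):
    return np.tanh(W @ x)          # toy differentiable mapping

x = rng.normal(size=4)
x_near = x + 0.01 * rng.normal(size=4)      # random nearby point
penalty = float(np.sum((f(x) - f(x_near)) ** 2))
```

For a small random perturbation the penalty is small; training adds it to the usual supervised loss so that the mapping becomes locally smooth around each data point.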


Stacked Training for Overfitting Avoidance in Deep Networks

When training deep networks and other complex networks of predictors, the risk of overfitting is typically of large concern. We examine the use of stacking, a method for training multiple simultaneous predictors in order to simulate the overfitting in early layers of a network, and show how to utilize this approach for both forward training and backpropagation learning in deep networks. We then...


Adversarial Training for Relation Extraction

Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data. We apply adversarial training to relation extraction within the multi-instance multi-label learning framework. We evaluate various neural network architectures on two different datasets. Experimental results demonstrate that adversarial training is generally effective f...


Adversarial Training for Sketch Retrieval

Generative Adversarial Networks (GAN) can learn excellent representations for unlabelled data, which have been applied to image generation and scene classification. To the best of our knowledge, these representations have not yet been applied to visual search. In this paper, we show that representations learned by GANs can be applied to visual search within heritage documents that contain Merchant ...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: ['2159-5399', '2374-3468']

DOI: https://doi.org/10.1609/aaai.v35i9.16989